Overview

Dataset statistics

Number of variables: 13
Number of observations: 2774
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 281.9 KiB
Average record size in memory: 104.0 B

Variable types

Numeric: 12
Unsupported: 1

Alerts

gross_revenue is highly correlated with invoice_no and 5 other fields [High correlation]
invoice_no is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_quantity is highly correlated with gross_revenue and 3 other fields [High correlation]
total_quantity is highly correlated with gross_revenue and 5 other fields [High correlation]
avg_ticket is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_un_basket_size is highly correlated with stock_code [High correlation]
stock_code is highly correlated with gross_revenue and 3 other fields [High correlation]
qty_returns is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_quantity is highly skewed (γ1 = 51.01197811) [Skewed]
avg_ticket is highly skewed (γ1 = 51.90076423) [Skewed]
frequency is highly skewed (γ1 = 46.08539806) [Skewed]
qty_returns is highly skewed (γ1 = 50.10197766) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
avg_recency_days is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
recency_days has 34 (1.2%) zeros [Zeros]
qty_returns has 1481 (53.4%) zeros [Zeros]
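
The Skewed and Zeros alerts above can be reproduced by hand. The sketch below is a minimal illustration, assuming that γ1 is the adjusted Fisher-Pearson sample skewness (the estimator pandas uses for Series.skew). The function names and the stand-in data are hypothetical and are not taken from the report.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed to be what the report calls γ1)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5                            # biased skewness
    return math.sqrt(n * (n - 1)) / (n - 2) * g1   # small-sample correction


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Stand-in column with the same zero count as qty_returns (1481 of 2774 rows).
# The non-zero entries are placeholders, not the real data.
qty_returns_like = [0] * 1481 + [1] * (2774 - 1481)
print(f"{100 * zero_share(qty_returns_like):.1f}%")  # 53.4%, as in the Zeros alert
```

A long right tail, such as the single avg_quantity value of 26999 against a median near 10, is the kind of pattern that pushes this coefficient to the large positive values listed above.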

Reproduction

Analysis started: 2022-10-10 19:13:47.311269
Analysis finished: 2022-10-10 19:14:19.660757
Duration: 32.35 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2774
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2251.237203
Minimum: 0
Maximum: 5696
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 181.65
Q1: 901.5
median: 2061.5
Q3: 3411.25
95-th percentile: 4958.85
Maximum: 5696
Range: 5696
Interquartile range (IQR): 2509.75

Descriptive statistics

Standard deviation: 1526.597887
Coefficient of variation (CV): 0.6781150763
Kurtosis: -0.956310095
Mean: 2251.237203
Median Absolute Deviation (MAD): 1241
Skewness: 0.3794934938
Sum: 6244932
Variance: 2330501.11
Monotonicity: Strictly increasing

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 1 | < 0.1%
2910 | 1 | < 0.1%
2896 | 1 | < 0.1%
2897 | 1 | < 0.1%
2900 | 1 | < 0.1%
2901 | 1 | < 0.1%
2905 | 1 | < 0.1%
2906 | 1 | < 0.1%
2907 | 1 | < 0.1%
2908 | 1 | < 0.1%
Other values (2764) | 2764 | 99.6%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
5696 | 1 | < 0.1%
5686 | 1 | < 0.1%
5680 | 1 | < 0.1%
5655 | 1 | < 0.1%
5649 | 1 | < 0.1%
5638 | 1 | < 0.1%
5637 | 1 | < 0.1%
5621 | 1 | < 0.1%
5620 | 1 | < 0.1%
5611 | 1 | < 0.1%
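
Several of the statistics listed for df_index are derived from others in the same block. The following sketch recomputes them from the reported values as a consistency check. The variable names are illustrative only, and the small tolerances allow for the rounding of the published figures.

```python
# Values copied from the df_index statistics in this report.
n_rows = 2774
minimum, maximum = 0, 5696
q1, q3 = 901.5, 3411.25
mean, std = 2251.237203, 1526.597887
total = 6244932

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance
mean_from_sum = total / n_rows    # Mean recomputed from Sum and row count

print(value_range, iqr)
print(round(cv, 6), round(variance, 2), round(mean_from_sum, 4))
```

The same relations hold for every numeric variable in the report, so they are a quick way to check that a label and its value were paired correctly.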

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2774
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15285.69971
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12626.65
Q1: 13815.25
median: 15242.5
Q3: 16779.75
95-th percentile: 17950.35
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2964.5

Descriptive statistics

Standard deviation: 1714.984904
Coefficient of variation (CV): 0.1121953811
Kurtosis: -1.206915065
Mean: 15285.69971
Median Absolute Deviation (MAD): 1483.5
Skewness: 0.01599078757
Sum: 42402531
Variance: 2941173.222
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
17850 | 1 | < 0.1%
14482 | 1 | < 0.1%
17058 | 1 | < 0.1%
17704 | 1 | < 0.1%
16933 | 1 | < 0.1%
13772 | 1 | < 0.1%
16249 | 1 | < 0.1%
14198 | 1 | < 0.1%
13989 | 1 | < 0.1%
17930 | 1 | < 0.1%
Other values (2764) | 2764 | 99.6%

Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%
12359 | 1 | < 0.1%
12360 | 1 | < 0.1%
12362 | 1 | < 0.1%
12364 | 1 | < 0.1%
12370 | 1 | < 0.1%

Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18273 | 1 | < 0.1%
18272 | 1 | < 0.1%
18270 | 1 | < 0.1%
18265 | 1 | < 0.1%
18263 | 1 | < 0.1%
18261 | 1 | < 0.1%
18260 | 1 | < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2760
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2904.751532
Minimum: 36.56
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 36.56
5-th percentile: 264.557
Q1: 628.9125
median: 1170.87
Q3: 2424.715
95-th percentile: 7579.4915
Maximum: 279138.02
Range: 279101.46
Interquartile range (IQR): 1795.8025

Descriptive statistics

Standard deviation: 10927.21927
Coefficient of variation (CV): 3.761843017
Kurtosis: 331.9508666
Mean: 2904.751532
Median Absolute Deviation (MAD): 688.765
Skewness: 16.26093044
Sum: 8057780.75
Variance: 119404120.9
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
1025.44 | 2 | 0.1%
745.06 | 2 | 0.1%
598.2 | 2 | 0.1%
1078.96 | 2 | 0.1%
731.9 | 2 | 0.1%
1353.74 | 2 | 0.1%
2053.02 | 2 | 0.1%
379.65 | 2 | 0.1%
1314.45 | 2 | 0.1%
331 | 2 | 0.1%
Other values (2750) | 2754 | 99.3%

Value | Count | Frequency (%)
36.56 | 1 | < 0.1%
52 | 1 | < 0.1%
52.2 | 1 | < 0.1%
62.43 | 1 | < 0.1%
68.84 | 1 | < 0.1%
70.02 | 1 | < 0.1%
77.4 | 1 | < 0.1%
84.65 | 1 | < 0.1%
90.3 | 1 | < 0.1%
93.35 | 1 | < 0.1%

Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
168472.5 | 1 | < 0.1%
140450.72 | 1 | < 0.1%
124564.53 | 1 | < 0.1%
117379.63 | 1 | < 0.1%
91062.38 | 1 | < 0.1%
72882.09 | 1 | < 0.1%
66653.56 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

ZEROS

Distinct: 252
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.62689257
Minimum: 0
Maximum: 372
Zeros: 34
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 10
median: 29
Q3: 73
95-th percentile: 211
Maximum: 372
Range: 372
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 68.41964137
Coefficient of variation (CV): 1.208253504
Kurtosis: 3.432018391
Mean: 56.62689257
Median Absolute Deviation (MAD): 23.5
Skewness: 1.898344739
Sum: 157083
Variance: 4681.247326
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
1 | 99 | 3.6%
4 | 87 | 3.1%
2 | 85 | 3.1%
3 | 85 | 3.1%
8 | 76 | 2.7%
10 | 67 | 2.4%
9 | 66 | 2.4%
7 | 65 | 2.3%
17 | 62 | 2.2%
22 | 55 | 2.0%
Other values (242) | 2027 | 73.1%

Value | Count | Frequency (%)
0 | 34 | 1.2%
1 | 99 | 3.6%
2 | 85 | 3.1%
3 | 85 | 3.1%
4 | 87 | 3.1%
5 | 43 | 1.6%
7 | 65 | 2.3%
8 | 76 | 2.7%
9 | 66 | 2.4%
10 | 67 | 2.4%

Value | Count | Frequency (%)
372 | 1 | < 0.1%
366 | 1 | < 0.1%
360 | 1 | < 0.1%
358 | 3 | 0.1%
354 | 1 | < 0.1%
337 | 1 | < 0.1%
336 | 2 | 0.1%
334 | 1 | < 0.1%
333 | 2 | 0.1%
330 | 1 | < 0.1%

invoice_no
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 55
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.053352559
Minimum: 2
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 2
median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 204
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 9.071461768
Coefficient of variation (CV): 1.498584739
Kurtosis: 183.9551027
Mean: 6.053352559
Median Absolute Deviation (MAD): 2
Skewness: 10.62505905
Sum: 16792
Variance: 82.29141862
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
2 | 780 | 28.1%
3 | 499 | 18.0%
4 | 393 | 14.2%
5 | 237 | 8.5%
6 | 173 | 6.2%
7 | 138 | 5.0%
8 | 98 | 3.5%
9 | 69 | 2.5%
10 | 55 | 2.0%
11 | 54 | 1.9%
Other values (45) | 278 | 10.0%

Value | Count | Frequency (%)
2 | 780 | 28.1%
3 | 499 | 18.0%
4 | 393 | 14.2%
5 | 237 | 8.5%
6 | 173 | 6.2%
7 | 138 | 5.0%
8 | 98 | 3.5%
9 | 69 | 2.5%
10 | 55 | 2.0%
11 | 54 | 1.9%

Value | Count | Frequency (%)
206 | 1 | < 0.1%
199 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%
86 | 1 | < 0.1%
72 | 1 | < 0.1%
62 | 2 | 0.1%
60 | 1 | < 0.1%
57 | 1 | < 0.1%

avg_quantity
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 2600
Distinct (%): 93.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.82792048
Minimum: 1
Maximum: 26999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2.559587889
Q1: 6.365349265
median: 10.33333333
Q3: 15.0396017
95-th percentile: 48.12125542
Maximum: 26999
Range: 26998
Interquartile range (IQR): 8.674252438

Descriptive statistics

Standard deviation: 517.9117006
Coefficient of variation (CV): 17.36331908
Kurtosis: 2654.457672
Mean: 29.82792048
Median Absolute Deviation (MAD): 4.267741935
Skewness: 51.01197811
Sum: 82742.6514
Variance: 268232.5296
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
10 | 8 | 0.3%
9.333333333 | 7 | 0.3%
11 | 6 | 0.2%
18 | 4 | 0.1%
6.5 | 4 | 0.1%
10.4 | 4 | 0.1%
9 | 4 | 0.1%
12 | 4 | 0.1%
12.5 | 4 | 0.1%
8 | 3 | 0.1%
Other values (2590) | 2726 | 98.3%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
1.052631579 | 1 | < 0.1%
1.055555556 | 1 | < 0.1%
1.131578947 | 1 | < 0.1%
1.21875 | 1 | < 0.1%
1.257142857 | 1 | < 0.1%
1.260416667 | 1 | < 0.1%
1.28 | 1 | < 0.1%
1.376146789 | 1 | < 0.1%
1.39408867 | 1 | < 0.1%

Value | Count | Frequency (%)
26999 | 1 | < 0.1%
2000 | 1 | < 0.1%
1802.8 | 1 | < 0.1%
1756.5 | 1 | < 0.1%
1009.5 | 1 | < 0.1%
740 | 1 | < 0.1%
715.2 | 1 | < 0.1%
664.6153846 | 1 | < 0.1%
657 | 1 | < 0.1%
600 | 1 | < 0.1%

total_quantity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1639
Distinct (%): 59.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1700.379957
Minimum: 2
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 119.65
Q1: 330.25
median: 705.5
Q3: 1478.75
95-th percentile: 4645.5
Maximum: 196844
Range: 196842
Interquartile range (IQR): 1148.5

Descriptive statistics

Standard deviation: 6079.161482
Coefficient of variation (CV): 3.575178276
Kurtosis: 437.6447231
Mean: 1700.379957
Median Absolute Deviation (MAD): 453.5
Skewness: 17.32001834
Sum: 4716854
Variance: 36956204.33
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
310 | 11 | 0.4%
246 | 8 | 0.3%
150 | 8 | 0.3%
219 | 7 | 0.3%
200 | 7 | 0.3%
300 | 7 | 0.3%
493 | 7 | 0.3%
1200 | 7 | 0.3%
272 | 7 | 0.3%
260 | 7 | 0.3%
Other values (1629) | 2698 | 97.3%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
25 | 1 | < 0.1%
27 | 2 | 0.1%
30 | 1 | < 0.1%
32 | 1 | < 0.1%
33 | 2 | 0.1%

Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80997 | 1 | < 0.1%
80263 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%
64549 | 1 | < 0.1%
64124 | 1 | < 0.1%
63312 | 1 | < 0.1%
58343 | 1 | < 0.1%
57885 | 1 | < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 2772
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.33677308
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.852702153
Q1: 12.42379049
median: 17.94212763
Q3: 25.07465812
95-th percentile: 88.42744262
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 12.65086763

Descriptive statistics

Standard deviation: 1071.049203
Coefficient of variation (CV): 20.46456325
Kurtosis: 2718.321218
Mean: 52.33677308
Median Absolute Deviation (MAD): 6.338589039
Skewness: 51.90076423
Sum: 145182.2085
Variance: 1147146.395
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
14.47833333 | 2 | 0.1%
4.162 | 2 | 0.1%
6.269700855 | 1 | < 0.1%
32.59775 | 1 | < 0.1%
19.03048387 | 1 | < 0.1%
28.55451613 | 1 | < 0.1%
12.80068182 | 1 | < 0.1%
6.396214689 | 1 | < 0.1%
26.08797101 | 1 | < 0.1%
17.98461538 | 1 | < 0.1%
Other values (2762) | 2762 | 99.6%

Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.511241379 | 1 | < 0.1%
2.515333333 | 1 | < 0.1%
2.65 | 1 | < 0.1%
2.656931818 | 1 | < 0.1%
2.707598253 | 1 | < 0.1%
2.760621572 | 1 | < 0.1%
2.770464191 | 1 | < 0.1%

Value | Count | Frequency (%)
56157.5 | 1 | < 0.1%
4453.43 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
872.13 | 1 | < 0.1%
841.0214493 | 1 | < 0.1%
651.1683333 | 1 | < 0.1%
640 | 1 | < 0.1%
624.4 | 1 | < 0.1%
615.75 | 1 | < 0.1%

avg_recency_days
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 21.8 KiB

frequency
Real number (ℝ≥0)

SKEWED

Distinct: 1225
Distinct (%): 44.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.04969870057
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008746355685
Q1: 0.01575839204
median: 0.0243902439
Q3: 0.04166666667
95-th percentile: 0.1153846154
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.02590827462

Descriptive statistics

Standard deviation: 0.337595074
Coefficient of variation (CV): 6.792835026
Kurtosis: 2296.516337
Mean: 0.04969870057
Median Absolute Deviation (MAD): 0.01069454458
Skewness: 46.08539806
Sum: 137.8641954
Variance: 0.113970434
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0.0625 | 18 | 0.6%
0.02777777778 | 17 | 0.6%
0.02380952381 | 16 | 0.6%
0.08333333333 | 15 | 0.5%
0.09090909091 | 15 | 0.5%
0.02941176471 | 14 | 0.5%
0.03448275862 | 14 | 0.5%
0.01923076923 | 13 | 0.5%
0.02564102564 | 13 | 0.5%
0.02127659574 | 13 | 0.5%
Other values (1215) | 2626 | 94.7%

Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005479452055 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | 0.1%
0.005698005698 | 3 | 0.1%

Value | Count | Frequency (%)
17 | 1 | < 0.1%
3 | 1 | < 0.1%
2 | 1 | < 0.1%
1.142857143 | 1 | < 0.1%
1 | 8 | 0.3%
0.75 | 1 | < 0.1%
0.6666666667 | 3 | 0.1%
0.550802139 | 1 | < 0.1%
0.5335120643 | 1 | < 0.1%
0.5 | 3 | 0.1%

qty_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 205
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.15897621
Minimum: 0
Maximum: 80995
Zeros: 1481
Zeros (%): 53.4%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 9
95-th percentile: 98
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1564.393524
Coefficient of variation (CV): 24.38308116
Kurtosis: 2586.254065
Mean: 64.15897621
Median Absolute Deviation (MAD): 0
Skewness: 50.10197766
Sum: 177977
Variance: 2447327.097
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 1481 | 53.4%
1 | 129 | 4.7%
2 | 117 | 4.2%
3 | 82 | 3.0%
4 | 72 | 2.6%
6 | 63 | 2.3%
5 | 55 | 2.0%
12 | 45 | 1.6%
8 | 39 | 1.4%
9 | 38 | 1.4%
Other values (195) | 653 | 23.5%

Value | Count | Frequency (%)
0 | 1481 | 53.4%
1 | 129 | 4.7%
2 | 117 | 4.2%
3 | 82 | 3.0%
4 | 72 | 2.6%
5 | 55 | 2.0%
6 | 63 | 2.3%
7 | 38 | 1.4%
8 | 39 | 1.4%
9 | 38 | 1.4%

Value | Count | Frequency (%)
80995 | 1 | < 0.1%
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3768 | 1 | < 0.1%
3332 | 1 | < 0.1%
2878 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1776 | 1 | < 0.1%

avg_un_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 997
Distinct (%): 35.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.12196419
Minimum: 1
Maximum: 299.7058824
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.5
Q1: 10.12708333
median: 17.2967033
Q3: 28
95-th percentile: 56.6469697
Maximum: 299.7058824
Range: 298.7058824
Interquartile range (IQR): 17.87291667

Descriptive statistics

Standard deviation: 18.86759007
Coefficient of variation (CV): 0.8528894591
Kurtosis: 24.17737545
Mean: 22.12196419
Median Absolute Deviation (MAD): 8.296703297
Skewness: 3.158633785
Sum: 61366.32865
Variance: 355.9859551
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
13 | 44 | 1.6%
14 | 30 | 1.1%
11 | 29 | 1.0%
1 | 26 | 0.9%
9 | 26 | 0.9%
10.5 | 25 | 0.9%
7.5 | 25 | 0.9%
9.5 | 24 | 0.9%
17.5 | 24 | 0.9%
15.5 | 23 | 0.8%
Other values (987) | 2498 | 90.1%

Value | Count | Frequency (%)
1 | 26 | 0.9%
1.2 | 1 | < 0.1%
1.25 | 1 | < 0.1%
1.333333333 | 2 | 0.1%
1.5 | 8 | 0.3%
1.568181818 | 1 | < 0.1%
1.571428571 | 1 | < 0.1%
1.666666667 | 4 | 0.1%
1.833333333 | 1 | < 0.1%
2 | 21 | 0.8%

Value | Count | Frequency (%)
299.7058824 | 1 | < 0.1%
203.5 | 1 | < 0.1%
145 | 1 | < 0.1%
136.125 | 1 | < 0.1%
135.5 | 1 | < 0.1%
122 | 1 | < 0.1%
118 | 1 | < 0.1%
114 | 1 | < 0.1%
110.3333333 | 1 | < 0.1%
110 | 1 | < 0.1%

stock_code
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 467
Distinct (%): 16.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 129.7433309
Minimum: 2
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 10
Q1: 34
median: 72
Q3: 143
95-th percentile: 400.05
Maximum: 7838
Range: 7836
Interquartile range (IQR): 109

Descriptive statistics

Standard deviation: 277.7854086
Coefficient of variation (CV): 2.141038053
Kurtosis: 336.8230491
Mean: 129.7433309
Median Absolute Deviation (MAD): 45
Skewness: 15.34866005
Sum: 359908
Variance: 77164.73323
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
28 | 38 | 1.4%
35 | 34 | 1.2%
27 | 30 | 1.1%
26 | 30 | 1.1%
29 | 30 | 1.1%
15 | 27 | 1.0%
19 | 27 | 1.0%
25 | 27 | 1.0%
31 | 27 | 1.0%
33 | 26 | 0.9%
Other values (457) | 2478 | 89.3%

Value | Count | Frequency (%)
2 | 11 | 0.4%
3 | 13 | 0.5%
4 | 16 | 0.6%
5 | 16 | 0.6%
6 | 24 | 0.9%
7 | 14 | 0.5%
8 | 13 | 0.5%
9 | 19 | 0.7%
10 | 19 | 0.7%
11 | 23 | 0.8%

Value | Count | Frequency (%)
7838 | 1 | < 0.1%
5673 | 1 | < 0.1%
5095 | 1 | < 0.1%
4580 | 1 | < 0.1%
2698 | 1 | < 0.1%
2379 | 1 | < 0.1%
2060 | 1 | < 0.1%
1818 | 1 | < 0.1%
1673 | 1 | < 0.1%
1637 | 1 | < 0.1%

Interactions

[Figure: 12 × 12 grid of pairwise interaction scatter plots for the numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
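
As a minimal sketch of that recipe, the code below ranks both variables (ties receive the average of their positions) and then applies Pearson's formula to the ranks. The helper names average_ranks, pearson_r and spearman_rho are illustrative and are not part of the report or of pandas-profiling.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations (1/n factors cancel)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relationship ρ is 1 up to floating-point error while r is noticeably lower, which is the difference between monotonic and linear association described above.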

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for an exactly linear relationship the slope of the line, that is, its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
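
The following sketch implements that definition directly and checks the invariance property on a small made-up sample. The function name pearson_r and the example data are illustrative assumptions, not values from the report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r_original = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_moved = [2 * a + 3 for a in x]
y_moved = [0.5 * b - 7 for b in y]
r_moved = pearson_r(x_moved, y_moved)

# An exactly linear relationship gives r = 1 whatever the slope.
r_steep = pearson_r(x, [10 * a for a in x])

print(r_original, r_moved, r_steep)
```

The first two printed values agree up to floating-point error, and the third is 1 up to the same error, so neither the units of the variables nor the steepness of a linear relationship changes r.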

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
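
A minimal sketch of that counting rule is shown below. It implements the simple form described in the text (often called tau-a) and counts tied pairs as neither concordant nor discordant. Libraries commonly apply a tie-corrected variant, so the heatmap in the report may have been computed slightly differently. The name kendall_tau_a and the sample data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move the same way
            concordant += 1
        elif direction < 0:    # the variables move in opposite ways
            discordant += 1
        # direction == 0 means a tie in x or y; it counts toward neither total
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
# 10 pairs in total: 8 concordant and 2 discordant.
print(kendall_tau_a(x, y))  # 0.6
```

The double loop over pairs is quadratic in the number of observations, which is fine for an illustration but slow for large columns.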

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

 | df_index | customer_id | gross_revenue | recency_days | invoice_no | avg_quantity | total_quantity | avg_ticket | avg_recency_days | frequency | qty_returns | avg_un_basket_size | stock_code
0 | 0 | 17850 | 5391.21 | 372.0 | 34.0 | 5.835017 | 1733.0 | 18.152222 | 1 days 00:00:00 | 17.000000 | 40.0 | 8.735294 | 297.0
1 | 1 | 13047 | 3232.59 | 56.0 | 9.0 | 8.128655 | 1390.0 | 18.904035 | 52 days 20:00:00 | 0.028302 | 35.0 | 19.000000 | 171.0
2 | 2 | 12583 | 6705.38 | 2.0 | 15.0 | 21.672414 | 5028.0 | 28.902500 | 26 days 12:00:00 | 0.040323 | 50.0 | 15.466667 | 232.0
3 | 3 | 13748 | 948.25 | 95.0 | 5.0 | 15.678571 | 439.0 | 33.866071 | 92 days 16:00:00 | 0.017921 | 0.0 | 5.600000 | 28.0
4 | 4 | 15100 | 876.00 | 333.0 | 3.0 | 26.666667 | 80.0 | 292.000000 | 20 days 00:00:00 | 0.073171 | 22.0 | 1.000000 | 3.0
5 | 5 | 15291 | 4623.30 | 25.0 | 14.0 | 20.607843 | 2102.0 | 45.326471 | 26 days 18:27:41.538461538 | 0.040115 | 29.0 | 7.285714 | 102.0
6 | 6 | 14688 | 5630.87 | 7.0 | 21.0 | 11.073394 | 3621.0 | 17.219786 | 19 days 06:18:56.842105263 | 0.057221 | 399.0 | 15.571429 | 327.0
7 | 7 | 17809 | 5411.91 | 16.0 | 12.0 | 33.721311 | 2057.0 | 88.719836 | 39 days 16:00:00 | 0.033520 | 41.0 | 5.083333 | 61.0
8 | 8 | 15311 | 60767.90 | 0.0 | 91.0 | 16.054645 | 38194.0 | 25.543464 | 4 days 04:35:03.370786516 | 0.243316 | 474.0 | 26.142857 | 2379.0
9 | 9 | 16098 | 2005.63 | 87.0 | 7.0 | 9.149254 | 613.0 | 29.934776 | 47 days 16:00:00 | 0.024390 | 0.0 | 9.571429 | 67.0

Last rows

 | df_index | customer_id | gross_revenue | recency_days | invoice_no | avg_quantity | total_quantity | avg_ticket | avg_recency_days | frequency | qty_returns | avg_un_basket_size | stock_code
2764 | 5611 | 17290 | 525.24 | 3.0 | 2.0 | 3.960784 | 404.0 | 5.149412 | 13 days 00:00:00 | 0.142857 | 0.0 | 51.0 | 102.0
2765 | 5620 | 14785 | 77.40 | 10.0 | 2.0 | 28.000000 | 84.0 | 25.800000 | 5 days 00:00:00 | 0.333333 | 0.0 | 1.5 | 3.0
2766 | 5621 | 17254 | 272.44 | 4.0 | 2.0 | 2.250000 | 252.0 | 2.432500 | 11 days 00:00:00 | 0.166667 | 0.0 | 56.0 | 112.0
2767 | 5637 | 17232 | 421.52 | 2.0 | 2.0 | 5.638889 | 203.0 | 11.708889 | 12 days 00:00:00 | 0.153846 | 0.0 | 18.0 | 36.0
2768 | 5638 | 17468 | 137.00 | 10.0 | 2.0 | 23.200000 | 116.0 | 27.400000 | 4 days 00:00:00 | 0.400000 | 0.0 | 2.5 | 5.0
2769 | 5649 | 13596 | 697.04 | 5.0 | 2.0 | 2.445783 | 406.0 | 4.199036 | 7 days 00:00:00 | 0.250000 | 0.0 | 83.0 | 166.0
2770 | 5655 | 14893 | 1237.85 | 9.0 | 2.0 | 10.945205 | 799.0 | 16.956849 | 2 days 00:00:00 | 0.666667 | 0.0 | 36.5 | 73.0
2771 | 5680 | 14126 | 706.13 | 7.0 | 3.0 | 33.866667 | 508.0 | 47.075333 | 3 days 00:00:00 | 0.750000 | 50.0 | 5.0 | 15.0
2772 | 5686 | 13521 | 1092.39 | 1.0 | 3.0 | 1.685057 | 733.0 | 2.511241 | 4 days 12:00:00 | 0.300000 | 0.0 | 145.0 | 435.0
2773 | 5696 | 15060 | 301.84 | 8.0 | 4.0 | 2.183333 | 262.0 | 2.515333 | 1 days 00:00:00 | 2.000000 | 0.0 | 30.0 | 120.0